Estimation of true audience size for digital content

ABSTRACT

A content server system provides a client device with content, such as an audio stream. Using various techniques, an estimate is made of the actual size of an audience associated with the provided content, rather than assuming that the audience is limited to a single user of the client device. The estimate may be made by the content server system, the client device, or the content server system and the client device collectively. Using the estimate of actual size of the audience, the content server system can take actions appropriate for the audience, such as providing advertisements appropriate for the size of the audience and for the collective characteristics of the audience members. The estimate of the actual audience size additionally allows the content server system to be compensated more precisely for any advertisements provided to that audience.

BACKGROUND

1. Field of Art

The present invention generally relates to the delivery of content and associated advertising, and more specifically, to ways of estimating the true size of an audience listening to or otherwise experiencing the digital content.

2. Background of the Invention

Providers of digital content may generate revenue by inserting advertisements into the digital content and receiving payment from advertisers according to a cost-per-impression (CPM) payment model. For example, the providers may insert audio advertisements between songs in an audio content stream provided to client devices. For purposes of the CPM model, the traditional assumption is that there is only one audience member for each audio stream provided by the provider to a client device, and the advertisers therefore only credit the provider with a single impression each time that an advertisement is inserted into an audio stream.

In reality, however, many people may be listening to or otherwise experiencing a single audio stream. For example, at a social event such as a gathering of friends in a home, there will typically be multiple people within hearing distance of the sound output device playing the audio stream, e.g., within a room. Thus, due to the lack of ability to accurately estimate the true audience size for an audio stream, the audio provider is credited only for a single impression for an ad provided on the audio stream, even when there are multiple people listening. This leads to a significant loss of potential revenue for the audio providers. Additionally, without having a more accurate estimate of the size and composition of the audience, it is difficult to provide the advertisements or other content that are most appropriate for the audience as a whole.

SUMMARY

In one embodiment, a computer-implemented method comprises identifying a streaming client device receiving streamed audio; determining a number of inferred listeners that are listening to the streamed audio when output by the streaming client device, the inferred listeners being in addition to a user of the streaming client device; including an advertisement in the streamed audio provided to the streaming client device; and logging an impression count responsive to the inclusion of the advertisement, the impression count reflecting the determined number of inferred listeners.

In one embodiment, a computer-implemented method performed by a client device comprises receiving streamed audio from a content server system; outputting the streamed audio; determining a number of inferred listeners that are listening to the streamed audio output by the client device, the inferred listeners being in addition to a user using the client device; and sending, to the content server system, an audience count that is based on the inferred number of listeners.

In one embodiment, a computer-readable storage medium comprises computer program instructions executable by a processor. The instructions comprise instructions for identifying a streaming client device receiving streamed audio; instructions for determining a number of inferred listeners that are listening to the streamed audio when output by the streaming client device, the inferred listeners being in addition to a user of the streaming client device; instructions for including an advertisement in the streamed audio provided to the streaming client device; and instructions for logging an impression count responsive to the inclusion of the advertisement, the impression count reflecting the determined number of inferred listeners.

The features and advantages described in the specification are not all inclusive and, in particular, many additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 illustrates a computing environment in which audio streaming and audience size estimation take place, according to one embodiment.

FIG. 2A is a high-level block diagram illustrating a detailed view of a client device of FIG. 1, according to one embodiment.

FIG. 2B is an example user interface provided by the application of FIGS. 1 and 2A, according to one embodiment.

FIG. 3 is a high-level block diagram illustrating a detailed view of the content server system of FIG. 1, according to one embodiment.

FIG. 4 is a sequence diagram that illustrates interactions of the server system, client device, and advertiser of FIG. 1 during the overall process of providing, serving, tracking, and reporting on advertisements, according to one embodiment.

FIGS. 5A-5D illustrate the interactions involved in a number of different techniques for estimating the number of inferred listeners in an audience, according to various embodiments.

FIG. 6 is a high-level block diagram illustrating physical components of a computer 600 used as part or all of the content server system or client device from FIG. 1, according to one embodiment.

The figures depict embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.

DETAILED DESCRIPTION

FIG. 1 illustrates a computing environment in which the content provision and audience size estimation take place, according to one embodiment. A content server system 100 provides digital content to client devices 110. The content server system 100 also inserts advertisements from one or more advertisers 120 into the content provided to the client devices 110, and is paid by the advertisers accordingly based on a cost-per-impression (CPM) payment model.

In one particular embodiment referred to throughout the remainder of the specification, the content server system 100 provides streamed audio content, such as songs, pieces of music, or audio recordings. It is appreciated, however, that in other embodiments the content server system 100 could alternatively and/or additionally provide other forms of digital content, such as videos, movies, slideshows, images, or non-streamed audio. Thus, subsequent references to “listening” or other audio-related terminology could equally apply to (for example) viewing videos or otherwise experiencing media provided by the content server system 100 in other embodiments.

As described in the remainder of the specification, the content server system 100 and the client devices 110 collectively estimate how many other people (equivalently, “users”) are listening to a stream, in addition to the person whose client device 110 is performing the streaming. The person whose device is performing the streaming (and who presumably is listening to the stream) is referred to hereinafter as the “direct listener” for the stream, and the other people estimated also to be listening to the stream on the streaming client device 110 are referred to hereinafter as “inferred listeners” for the stream. The “audience” for a particular stream provided to a particular client device 110 consists of the direct listener and the inferred listeners (if any). (Note that the inferred listeners for a stream, or more generally for content, are limited to those who hear or otherwise experience the content when output at the particular client device. Thus, for example, the inferred listeners for content output at a particular client device do not also include those hearing the content output by a different client device at a different location, such as those in another city.) For example, if there were five friends listening together to a stream that one of them initiated on his device, there would be one direct listener and four inferred listeners (presuming that the estimation techniques accurately identified all four of the other friends). Thus, in this example, the estimated audience size would be five: one direct listener and four inferred listeners. An advertisement provided to an entire audience defined by a stream results in one “direct impression” corresponding to the direct listener, and one “indirect impression” for each inferred listener. In the above example, for instance, providing an advertisement audible to the entire audience would result in one direct impression and four indirect impressions.
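By way of illustration only, the relationship between the direct listener, the inferred listeners, and the resulting impressions may be sketched as follows. The names `Audience` and `impressions_for_audience_wide_ad` are hypothetical and are not part of the described system; the sketch merely restates the arithmetic of the five-friends example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Audience:
    """Audience for one stream on one client device (hypothetical type)."""

    inferred_listeners: int  # people other than the direct listener

    @property
    def size(self) -> int:
        # Exactly one direct listener per stream, plus any inferred listeners.
        return 1 + self.inferred_listeners


def impressions_for_audience_wide_ad(audience: Audience) -> dict:
    """One direct impression, plus one indirect impression per inferred listener."""
    return {"direct": 1, "indirect": audience.inferred_listeners}


friends = Audience(inferred_listeners=4)
print(friends.size, impressions_for_audience_wide_ad(friends))
```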

The client devices 110 are computing devices such as smartphones with an operating system such as ANDROID or APPLE IOS, tablet computers, laptop computers, desktop computers, electronic stereos in automobiles or other vehicles, or any other type of network-enabled device on which digital content may be listened to or otherwise experienced. Typical client devices 110 include the hardware and software needed to input and output sound and images (e.g., speakers and microphone), connect to an electronic network (e.g., via Wifi and/or 4G or other wireless telecommunication standards), determine the current geographic location of the client devices (e.g., a Global Positioning System (GPS) unit), and/or detect motion of the client devices (e.g., via motion sensors such as accelerometers and gyroscopes).

The client devices 110 may have an application 111 that allows interaction with the content server system 100. For example, the application 111 could be a browser that allows a user of the client device 110 to obtain content by browsing a web site of the content server system 100. As another example, the application 111 could be a dedicated application specifically designed (e.g., by the organization responsible for the content server system 100) to enable interactions with the content server system 100 and its content. The application 111 on a particular client device 110 may be associated with a user of the client device 110 (e.g., via a one-time registration, or a username and password pair or other credentials). When the application 111 is associated with a user, the application can store or otherwise gain access to the user's past listening history and demographic data about the user (either expressly provided by the user, or inferred based on factors such as listening history, geographic location, name, and the like), and can use this information to provide content and advertisements that are most likely to be appreciated by that particular user. In addition to allowing a user to explicitly obtain content from the content server system 100, the application 111 may also implicitly provide the content server system 100 with data about status and use of the client device 110, such as its network ID, geographic location, physical movement, and/or sound input, although in some embodiments the user of the application may elect to disable this feature.

The content server system 100 and the client devices 110 are connected via a network 140. The network 140 may be any suitable communications network for data transmission. The network 140 uses standard communications technologies and/or protocols and can include the Internet. In another embodiment, the network 140 includes custom and/or dedicated data communications technologies.

The client device 110 and content server system 100 are now described in more detail with respect to FIGS. 2A, 2B, and 3, below.

FIG. 2A is a high-level block diagram illustrating a detailed view of a client device 110 of FIG. 1, according to one embodiment.

A client device has a set of sensors 215 that collect data associated with properties of the client device 110, such as data about the physical environment or state of the client device. Different types or models of client devices may have different sensors. Illustrated in the embodiment of FIG. 2A are a set of sensors 215 particularly appropriate for a smartphone client device, though it is appreciated that other client devices may have different sensors.

The illustrated sensors 215 include a movement detection sensor 216, which detects properties of movement of the client device such as speed, acceleration, or direction. The movement detection sensor 216 may include accelerometers or gyroscopes, for example. Another illustrated sensor is the geolocation sensor 217, which determines a particular geographic location of the client device 110, such as coordinates provided by Global Positioning System (GPS) or other geographic location systems. Another illustrated sensor is the audio input sensor 218, such as a microphone, which detects and measures sound. Another illustrated sensor is a network sensor 219, which identifies a network(s) that the client device 110 is currently using for communication, such as a Wifi network, or a 4G or other telecommunication network.

The client device 110 may also include an application 111 specifically designed to operate with the content server system 100. For example, in one embodiment the application includes a user interface 250 for interacting with an audio stream, as illustrated in FIG. 2B. The example user interface 250 includes a description area 251 providing information on a currently-playing song, an optional image advertisement 252, and controls 253 for registering appreciation for, or dislike of, the song currently playing, and for pausing, playing, or skipping the current song. The example user interface 250 also includes a set of options 255 (shown in response to selection of popup control 254) that include an option 256 to request an audio stream that includes songs (tracks) associated with a particular artist, genre, or the like, and an option 257 to share a stream with other nearby users of the content server system 100 so that the other users can also (for example) react to the currently-playing song, such as registering appreciation for, or dislike of, the song, sharing the song, bookmarking the song, or the like.

Returning again to FIG. 2A, in some embodiments the application 111 includes a stream sharing module 260 that, when requested by a user (e.g., via the option 257 of FIG. 2B), makes the currently-playing stream accessible to others, and (for applications 111 on client devices not currently streaming) allows their users to interact with the stream. For example, on the client device 110 doing the streaming, the stream sharing module 260 may broadcast the availability of the stream to other client devices 110 using short-range wireless communications; and on client devices located nearby, the stream sharing module 260 may note the availability of the stream in the user interface, and in response to the user accepting the shared stream, show the playing stream along with controls allowing the user to comment on or otherwise interact with the stream.

In some embodiments, the application 111 also includes an audience size estimation module 262 that estimates (possibly in cooperation with the content server system 100) an actual size of the audience for the stream, including a number of inferred listeners, in addition to the direct listener of the client device 110 receiving the stream from the content server system. Details on the various techniques used by the audience size estimation module 262 are provided below with respect to the operations of FIGS. 5A-5D. The application 111 may use the audience size estimation module 262 multiple times while the current audio stream is being provided in order to keep the content server system 100 updated with the most accurate audience size estimation. For example, in one embodiment the audience size estimation module 262 is used directly before an advertisement is inserted into the audio stream; in other embodiments, it may be used at predetermined time intervals, such as every 2 minutes.

In one embodiment, the application 111 also includes an audience reporting module 264 that provides the estimates generated by the audience size estimation module 262 to the content server system 100. For example, if the audience size estimation module 262 within the application 111 on a particular client device 110 estimates that there are five people in the audience for an audio stream received at that client device, then it notifies the content server system 100 that that particular audio stream/client device has an audience that includes four inferred listeners, in addition to the one direct listener.
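As a minimal sketch of such a notification, the following builds a report message from a locally estimated audience size. The function name `build_audience_report`, the JSON encoding, and the field names are assumptions made for illustration; the specification does not define a message format.

```python
import json


def build_audience_report(stream_id: str, estimated_audience_size: int) -> str:
    """Serialize an audience report for the content server (hypothetical format)."""
    if estimated_audience_size < 1:
        # The audience always includes at least the direct listener.
        raise ValueError("audience size must be at least 1")
    return json.dumps(
        {
            "stream_id": stream_id,  # hypothetical identifier for the stream
            "direct_listeners": 1,
            "inferred_listeners": estimated_audience_size - 1,
        },
        sort_keys=True,
    )


# An estimate of five people yields four inferred listeners.
print(build_audience_report("stream-1", 5))
```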

FIG. 3 is a high-level block diagram illustrating a detailed view of the content server system 100 of FIG. 1, according to one embodiment.

The content server system 100 includes a content provision module 305 that provides requested content to the application 111 of the requesting client device 110. For embodiments in which the requested content type is audio streams, for example, the content provision module 305 initiates a stream of the requested audio, streaming the audio to the client device 110 over time.

The content server system 100 also includes an audience size estimation module 310 that estimates (possibly in cooperation with the application 111 on the client devices 110) an actual size of the audience for each of the various streams currently being provided, including a number of inferred listeners. Details on the various techniques used by the audience size estimation module 310 are provided below with respect to the operations of FIGS. 5A-5D.

It is appreciated that the various techniques of FIGS. 5A-5D are not exhaustive, and that other techniques, or variations on these techniques, are also possible. For example, in some embodiments other sensor data from the client devices 110 are also taken into account. For instance, in one embodiment sensor data such as the status of the audio headphone output is used to determine whether the provided content is truly audible to more than the streaming user, or whether (in contrast) the audio is being provided to an output audible by only one listener. If the audio headphone output indicates that the sound is being output exclusively to headphones, for example, then it is assumed that there are no inferred listeners, since only someone with the headphones would hear the audio. As another example, in some embodiments the status of direct device links such as Bluetooth or similar proprietary protocols is used to infer whether the audio is being provided to an output device (e.g., a home or car stereo) that typically would be audible by multiple people.
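The headphone and device-link rules just described may be sketched as follows, purely for illustration. The `OutputRoute` enumeration and both function names are hypothetical; how a real client device reports its audio route is platform-specific and is not specified here.

```python
from enum import Enum


class OutputRoute(Enum):
    """Where the client device is sending audio (hypothetical categories)."""

    HEADPHONES = "headphones"
    BUILT_IN_SPEAKER = "built_in_speaker"
    EXTERNAL_STEREO = "external_stereo"  # e.g., home or car stereo over a device link


def adjust_inferred_listeners(estimate: int, route: OutputRoute) -> int:
    """Override an estimate when only the direct listener can hear the audio."""
    if route is OutputRoute.HEADPHONES:
        # Audio output exclusively to headphones: assume no inferred listeners.
        return 0
    return estimate


def is_likely_shared_output(route: OutputRoute) -> bool:
    """True when the output device would typically be audible by multiple people."""
    return route is OutputRoute.EXTERNAL_STEREO
```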

The audience size estimation module 310 may be used at many different times for a particular stream in order to maintain an accurate audience size estimate. For example, in one embodiment the audience size estimation module 310 is used directly before an advertisement is inserted into the audio stream; in other embodiments, it may be used at predetermined time intervals, such as every 2 minutes.
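One possible way to decide when to re-run the estimation is sketched below. The specification presents the pre-advertisement trigger and the fixed-interval trigger as alternative embodiments; combining them in a single function, the name `should_reestimate`, and the time units are assumptions made for illustration.

```python
def should_reestimate(
    now_s: float,
    last_estimate_s: float,
    ad_about_to_play: bool,
    interval_s: float = 120.0,  # "every 2 minutes" from the example above
) -> bool:
    """Return True when the audience size should be estimated again."""
    # Trigger 1: directly before an advertisement is inserted into the stream.
    if ad_about_to_play:
        return True
    # Trigger 2: a predetermined interval has elapsed since the last estimate.
    return (now_s - last_estimate_s) >= interval_s
```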

The content server system 100 also includes an audience characteristics determination module 315 that determines characteristics of the audience as a whole (in addition to its size as estimated by the audience size estimation module 310), based on the characteristics of the individual people in the audience. The characteristics of the direct listener can be obtained using the application 111 of the client device 110 to which the content (e.g., audio stream) is being provided, assuming that the direct listener has registered with the application 111. Additionally, inferred listeners may have client devices 110 with their own applications 111 that are in communication with the content server system 100, even though those applications 111 are not actively being used to stream the audio or otherwise obtain the content. In such a case, the applications 111 of the client devices 110 of the inferred listeners may be able to provide information about the inferred listeners.

The characteristics that can be determined include characteristics derived from the client device 110 itself, such as client device type (e.g., screen size and resolution); geolocation (e.g., GPS coordinates) of the client devices 110 of the users in the audience; a more semantically-meaningful description of the geolocation (e.g., “in a private home” or “in a public place”); and current network being used to communicate (e.g., Wifi, or 4G LTE). The characteristics also include information related to use of the content server system 100, such as the past listening history (e.g., songs, artists) of the users of the client devices 110, and demographic data about the users. The demographic data may be provided explicitly by the users when using the applications 111, or it may be inferred from other factors, such as inferring that a particular user is American in response to the user typically listening to English-language programs at locations within the United States.

The content server system 100 additionally includes an ad selector module 320 that selects one or more advertisements to include within the provided content (e.g., an audio stream). In one embodiment, the ad selector module 320 may additionally select advertisements that are presented outside of the provided content, such as display image advertisements presented visually on screens of client devices 110, rather than audio advertisements inserted into audio content. In one embodiment, the advertisements are selected from an advertisements repository 301 to which the various advertisers 120 submit advertisements for potential inclusion within content.

Advertisements which, when output on the client device 110 of the direct listener, can be heard or otherwise experienced by others in the audience (in addition to the direct listener) are referred to as “audience-wide” advertisements, and those that are intended to be experienced only by a single member of the audience at a time are referred to as “individual user” advertisements. For example, audio advertisements played through the speakers of the client device 110 of the direct listener are audience-wide advertisements, and display image advertisements individually selected for, and provided to, the client devices 110 of individual users in the audience are individual user advertisements. Visual advertisements provided to a client device that could have multiple viewers (e.g., a TV screen) may also be considered audience-wide advertisements.

The audience-wide advertisements are selected, based on the audience characteristics provided by the audience characteristics determination module 315, to be appropriate not only for the direct listener to whose client device 110 the content is being provided, but also for the inferred listeners (if any) within the audience for the content that is output by the client device 110. For example, the content server system 100 may select the advertisements based on the language(s) spoken by the members of the audience (both the direct and inferred listeners), whether explicitly specified or inferred; the gender(s) of the members of the audience; any known interests or preferences of the members of the audience; or any other known characteristic of relevance. For example, if the ad selector module 320 determines that there are multiple languages spoken by the audience, it selects advertisements that are in the language of the country to which the content is being provided, since that is the presumed lingua franca generally understood by the audience.
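The language rule in the preceding example may be sketched as follows. The function name `pick_ad_language` is hypothetical, and the handling of a single shared language (use that language) and of an audience with no known languages (fall back to the country's language) are assumptions; the specification addresses only the multiple-language case.

```python
from typing import Iterable


def pick_ad_language(audience_languages: Iterable[str], country_language: str) -> str:
    """Choose the language for an audience-wide advertisement."""
    distinct = set(audience_languages)
    if len(distinct) == 1:
        # Assumption: everyone shares one language, so advertise in it.
        return next(iter(distinct))
    # Multiple (or unknown) languages: use the language of the country to
    # which the content is provided, as the presumed lingua franca.
    return country_language
```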

The ad selector module 320 may also use other non-audience-specific factors when selecting the advertisement(s), such as any preferences specified by the advertisers 120 who submitted the ads to the advertisements repository 301 (e.g., some advertisers may only wish their advertisements to be provided to audiences with specified properties); the advertisement mix model of the content server system 100, which specifies the order in which advertisements with different properties (e.g., from different advertisers, with different subject matter) should be provided so as to maximize user interest; and any advertising auction factors, such as preferring advertisements for which the advertisers are willing to pay higher per-impression fees. Another factor that the ad selector may take into account is whether a given advertiser has had its specified minimum number of direct impressions provided yet. For example, assume that a particular advertiser has specified that its advertisements should collectively receive 10,000 direct impressions per month; that those advertisements have only received 8,000 direct impressions thus far; and that it is late in the month, so that it is unlikely that the advertiser's advertisements will receive the specified 10,000 direct impressions by the end of the month. In such an example, the ad selector module 320 may tend to provide advertisements of that advertiser to audiences with large numbers of inferred listeners, so that a larger number of indirect impressions may help to offset the failure of the advertisements to receive the specified number of direct impressions.
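One way such a tendency could be expressed is as a score bonus that grows with both the advertiser's shortfall and the number of inferred listeners. The sketch below is an assumption made for illustration: the pacing model (linear over the period), the function name `shortfall_boost`, and the scoring formula are not taken from the specification.

```python
def shortfall_boost(
    target_direct: int,
    delivered_direct: int,
    inferred_listeners: int,
    fraction_of_period_elapsed: float,
) -> float:
    """Bonus favoring large audiences for advertisers behind on direct impressions."""
    # Assumed pacing model: direct impressions should accrue linearly over the period.
    expected_so_far = target_direct * fraction_of_period_elapsed
    shortfall = max(0.0, expected_so_far - delivered_direct)
    if shortfall == 0.0:
        return 0.0  # on pace or ahead: no special preference
    # Larger shortfalls and larger inferred audiences both raise the bonus.
    return (shortfall / target_direct) * inferred_listeners


# Late in the month, 8,000 of 10,000 direct impressions delivered:
small = shortfall_boost(10_000, 8_000, inferred_listeners=1, fraction_of_period_elapsed=0.95)
large = shortfall_boost(10_000, 8_000, inferred_listeners=4, fraction_of_period_elapsed=0.95)
print(large > small)
```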

If the client devices 110 of the various members of the audience are in communication with the content server system 100, the ad selector module 320 may also select different individual user advertisements for display within the applications 111 of those client devices. For example, in addition to providing an audio advertisement to the client device 110 of the direct listener within an audio stream, the ad selector module 320 may also provide display image advertisements (for example) on the screens of the other client devices, so that those users can see the advertisements while they are listening to the audio stream playing on the client device of the direct listener. Those additional display image advertisements may be selected based on the known characteristics of the respective users, such as their languages, locations, nationalities, prior listening histories, and the like.

The content server system 100 additionally includes an ad serving module 325 that provides the selected advertisements to the client device 110 receiving the content. In embodiments in which the content is streaming audio, for example, the ad serving module 325 inserts audience-wide audio advertisements into the audio stream provided to the client device 110 of the direct listener. The ad serving module 325 may also provide additional individual user advertisements to audience members other than the direct listener, e.g., by sending display image advertisements to any client devices 110 of the inferred listeners that are in communication with the content server system 100.

The content server system 100 additionally includes an impression logging module 330 that adds additional impression counts (stored in an ad impressions repository 302) based on the providing of the selected advertisements and the audience size estimation. The impression counts reflect the number of impressions associated with the providing of an ad to client devices 110, including direct impressions and indirect impressions. For example, the impression logging module 330 logs (N+1) additional impressions for each audience-wide advertisement provided to the client device 110 of the direct listener: N indirect impressions (assuming N inferred listeners were identified), and one direct impression (corresponding to the direct listener). An impression count may be logged in different ways in different embodiments, such as one number for the sum of the direct and indirect impressions, and one for the number of indirect impressions; solely the number of indirect impressions (from which the sum is derivable, given that there will be one direct impression); or the like. The impression logging module 330 also logs one additional impression for each individual user advertisement provided to any of the audience members. Payments to the content server system 100 by the advertisers 120 are then based on the logged impressions in the ad impressions repository 302.
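A minimal in-memory sketch of this (N+1) logging rule follows. The class `ImpressionLog` and its methods are hypothetical stand-ins for the impression logging module and repository; keeping individual user impressions in a separate counter is an assumption, since the specification does not say whether they are tallied as direct or indirect.

```python
from collections import Counter


class ImpressionLog:
    """Per-advertisement impression counters (hypothetical, in-memory only)."""

    def __init__(self) -> None:
        self.direct = Counter()
        self.indirect = Counter()
        self.individual = Counter()  # assumption: tracked separately

    def log_audience_wide(self, ad_id: str, inferred_listeners: int) -> None:
        # One direct impression for the direct listener, plus N indirect
        # impressions for the N inferred listeners: (N + 1) in total.
        self.direct[ad_id] += 1
        self.indirect[ad_id] += inferred_listeners

    def log_individual(self, ad_id: str) -> None:
        # One impression per individual user advertisement provided.
        self.individual[ad_id] += 1

    def total(self, ad_id: str) -> int:
        return self.direct[ad_id] + self.indirect[ad_id] + self.individual[ad_id]


log = ImpressionLog()
log.log_audience_wide("ad-1", inferred_listeners=4)
print(log.total("ad-1"))
```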

The content server system 100 additionally includes a reporting module 340 that provides a report of advertising statistics to advertisers 120. The report for a given advertiser includes the total number of impressions for the various advertisements of the advertiser over some time period, as logged by the impression logging module 330 when updating the impressions statistics. In one embodiment, the impressions in the report include the total number of impressions (collectively, and/or for individual advertisements), as well as the number of direct and indirect impressions that constitute the total. In some embodiments, the report is provided in response to an explicit request of the advertiser 120 (e.g., logging into an account of the advertiser on the content server system in order to see a report of advertisement statistics in a user interface); in other embodiments, the report is periodically sent to the advertiser, e.g., in a summary email.

FIG. 4 is a sequence diagram that illustrates interactions of the server system 100, client device 110, and advertiser 120 during the overall process of providing, serving, tracking, and reporting on advertisements, according to one embodiment.

Advertisers 120 contribute 405 advertisements to the content server system 100 for potential inclusion by the content server system within the content (e.g., streamed audio) provided to clients, and the content server system stores the advertisements in an advertisements repository 301 of candidate advertisements. In other embodiments, the content server system 100 does not maintain an ad repository, but instead dynamically obtains advertisements from the advertisers 120 at the time the ads are served. The various advertisers may specify, as part of agreements with the organization responsible for the content server system 100, that their advertisements must (either individually or collectively) be served to users some given minimum number of times during a given time period, e.g., at least 10,000 impressions per month for the advertisements taken collectively.

The advertisements may be of different data formats, as appropriate for the way in which the advertisements are output to users. For example, in one embodiment in which the primary provided content is streamed audio, the data formats of the advertisements include audio advertisements, which are inserted into the audio stream and output through the speakers of the client device 110 that receives the audio stream, and image advertisements, which are displayed by the application 111 on the screen of that client device.

An application 111 of a client device 110 sends a request 410 for content to the content server system 100, such as to initiate an audio stream. For example, the application 111 might request content explicitly requested by the user (e.g., a particular song, or songs associated with a particular artist), or automatically request content that a user registered with the application 111 is expected to appreciate (e.g., a particular song, or songs associated with a particular artist, that the user selected in the past, or songs determined to be similar in some way to those selected in the past). The content server system 100 accordingly provides 412 the content to the client device 110, such as by beginning to stream the requested audio. The application 111A accordingly outputs 413 the content in a manner appropriate for the type of content. For example, for audio stream content, the application 111A causes the client device 110A to send the audio of the stream to its speakers or other sound output port or device.

During some time period after the request 410 for content, the content server system 100 identifies attributes of the audience of the provided content. In particular, the content server system 100 estimates 415 a size of the audience of the provided content, including estimating a number of inferred listeners. In some embodiments, the content server system 100 itself estimates 415 the size of the audience by receiving data, such as sensor data, from the client devices 110 and performing the estimation based on the received data. In other embodiments, the estimation 415 is accomplished by applications 111 of the client devices 110 locally performing audience size estimation, and the content server system 100 receiving those estimates.

Examples of different techniques for estimating the size of the audience are provided below with respect to FIGS. 5A-5D. Although not depicted in FIG. 4, the estimation 415 may be performed many times for particular requested content (e.g., during the existence of a requested audio stream), such as directly before the content server system provides an advertisement to client devices 110, or at predetermined time intervals.

The content server system 100 also determines 417 characteristics of the audience as a whole (in addition to its size), as described above with respect to the audience characteristics module 315 of FIG. 3.

The content server system 100 also selects 420 one or more advertisements to include within the provided content (e.g., an audio stream), as discussed above with respect to the ad selector module 320 of FIG. 3.

The content server system 100 then provides 425 the selected advertisements, as described above with respect to the ad serving module 325 of FIG. 3.

The content server system 100 also logs 430 an impression count based on the providing of the selected advertisements at step 425 and the audience size estimated at step 415, as discussed above with respect to the ad impression logging module 330 of FIG. 3.
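By way of illustration only, the logging of step 430 might be sketched as follows. The class name `ImpressionLog`, its fields, and the rule of crediting one impression for the direct listener plus one per inferred listener are assumptions made for this sketch and are not mandated by the embodiments described herein.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ImpressionLog:
    """Hypothetical in-memory impression log keyed by advertisement id."""

    counts: Dict[str, int] = field(default_factory=dict)

    def log(self, ad_id: str, inferred_listeners: int) -> int:
        # Assumption: the direct listener counts as one impression and each
        # inferred listener adds one more; negative estimates are clamped to 0.
        impressions = 1 + max(0, inferred_listeners)
        self.counts[ad_id] = self.counts.get(ad_id, 0) + impressions
        return impressions


log = ImpressionLog()
log.log("ad-1", inferred_listeners=3)  # direct listener plus three inferred
log.log("ad-1", inferred_listeners=0)  # direct listener only
```

Other embodiments could weight inferred listeners differently, e.g., according to a confidence associated with the estimate of step 415.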

In one embodiment, the content server system 100 also provides 435 a report of the advertising statistics to the advertiser, as discussed above with respect to the reporting module 340 of FIG. 3.

FIGS. 5A-5D illustrate the interactions involved in a number of different techniques for estimating the number of inferred listeners in an audience as in step 415 of FIG. 4, according to various embodiments. The techniques are performed by the audience size estimation modules 310 of the content server system 100 and/or 262 of the applications 111.

FIG. 5A illustrates interactions involved when estimating the number of inferred listeners based on similar location-related data of client devices, according to one embodiment. Content (e.g., an audio stream) is provided 412 to a first client device 110A with a first application 111A, e.g., at the request of a first user. Additionally, one or more other, different users have additional client devices 110B (e.g., smartphones) with one or more corresponding additional applications 111B, although they are not currently receiving the content provided to the first client device 110A.

In one embodiment, the applications 111A, 111B all provide 505 the content server system 100 with location-related data indicating the locations of the client devices, such as geolocation information (e.g., GPS coordinates), and/or network-related data (e.g., an identifier of a Wi-Fi or other local network to which the devices are connected). The content server system 100 then analyzes 510 the location-related data provided by the applications 111A, 111B, and determines based on the data that their respective client devices 110A, 110B are likely in sufficiently close proximity that the users of those client devices can all hear the content when output by the first client device and thus constitute a single audience. In one embodiment, the analysis 510 includes determining that the geolocation information indicates locations within some threshold distance of each other (e.g., 25 meters) that is sufficiently small that the sound of the audio stream produced by the client device 110A would likely be audible to users of the client devices 110B. In one embodiment, the analysis 510 additionally and/or alternatively includes determining that the network-related data indicate that the devices 110A, 110B are on the same local network, and hence are likely close enough that their respective users would constitute a single audience.
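A minimal sketch of one possible form of the analysis 510 is given below. It assumes that each device reports latitude and longitude in degrees and, optionally, a local network identifier; the names `DeviceLocation`, `haversine_m`, and `count_inferred_listeners` are hypothetical, and the haversine great-circle formula is merely one reasonable choice of distance measure, not one specified herein.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, in meters


@dataclass
class DeviceLocation:
    """Hypothetical location-related data reported by one client device."""

    device_id: str
    lat: float  # degrees
    lon: float  # degrees
    network_id: Optional[str] = None  # e.g., identifier of a local network


def haversine_m(a: DeviceLocation, b: DeviceLocation) -> float:
    """Great-circle distance between two devices, in meters."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(h)))


def count_inferred_listeners(
    streaming: DeviceLocation,
    others: Iterable[DeviceLocation],
    threshold_m: float = 25.0,
) -> int:
    """Count other devices that are on the same network or within threshold_m."""
    count = 0
    for other in others:
        same_network = (
            streaming.network_id is not None
            and streaming.network_id == other.network_id
        )
        if same_network or haversine_m(streaming, other) <= threshold_m:
            count += 1
    return count
```

In this sketch either signal alone suffices to treat a user as an inferred listener; an embodiment could instead require both signals to agree in order to reduce false positives.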

In another embodiment (not illustrated in FIG. 5A), the analysis 510 is performed locally by the applications 111, rather than by the server system. For example, the applications 111 could locally broadcast their location-related data, and the application 111A of the first client device 110A receiving the audio stream could receive the location-related data of the applications 111B. The application 111A could then determine (using the same analysis as that described directly above as being performed by the content server system 100) that the other client devices 110B are sufficiently close that their users should be included within a single audience. The application 111A would then send a notification to the server system 100 that there are other inferred listeners (i.e., the users of the other client devices 110B) associated with the audio stream being provided to the first client device 110A.

With the audience size estimated based on the analysis 510, the content server system 100 provides 425 an advertisement and logs 430 the impression counts associated with the advertisement, as described above with respect to FIG. 4.

FIG. 5B illustrates interactions involved when estimating the number of inferred listeners based on an explicit sharing of a stream, according to one embodiment. A first application 111A of a first client device 110A makes a request 410 for a content (e.g., audio) stream, e.g., at the request of a first user of the first client device, and the server system begins 512 sending the stream to the application 111A. Additionally, one or more other, different users have additional client devices 110B with corresponding applications 111B, although they are not currently receiving the content provided to the first client device 110A.

The first user, knowing that the other users are nearby and listening to the same stream, and wishing the other users to be able to interact with the stream, explicitly requests 515 sharing the stream with other nearby users (e.g., using the option 257 of FIG. 2B). As a result, the application 111A causes the client device 110A to send a message (e.g., via broadcast) to any nearby client devices 110 notifying 520 the client devices that the stream is available for use by others.

The applications 111B of the other client devices 110B detect 525 the message and inform their corresponding users of the availability of the stream. For example, in one embodiment the applications 111B update their user interfaces to include a description of the stream, such as a username or other identifier of the first user (the “owner” of the stream) and/or of the client device 110A which is receiving the stream. The other users then elect to join 530 the stream, e.g., using a user interface provided by the applications 111B for that purpose (such as a “Join” option associated with a description of the stream). In one embodiment, joining the stream causes the applications 111B to supplement their user interfaces with controls related to the stream, such as options to indicate approval (e.g., a “thumbs up” control) of the stream or of a song within the stream, to share the stream, and the like.

In response to the additional users joining the stream, the applications 111B accordingly notify 535 the server system 100 that there are inferred listeners associated with the stream. Accordingly, when the server system 100 provides 425 an advertisement, it logs 430 the impression counts for the advertisement to include the inferred listeners (i.e., the additional users), as well as the direct listener (the first user).
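The server-side bookkeeping implied by the notifications 535 could, under some assumptions, be as simple as the following sketch. The class `SharedStream` and its members are hypothetical names introduced for illustration; the sketch assumes that each user has a stable identifier and that repeated join notifications from the same user should be counted only once.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class SharedStream:
    """Hypothetical record of one shared stream and the users who joined it."""

    owner_id: str  # the direct listener (the "owner" of the stream)
    joined: Set[str] = field(default_factory=set)

    def join(self, user_id: str) -> None:
        # The owner is already counted as the direct listener, and the set
        # makes duplicate join notifications from one user harmless.
        if user_id != self.owner_id:
            self.joined.add(user_id)

    @property
    def inferred_listeners(self) -> int:
        return len(self.joined)

    def impressions_per_ad(self) -> int:
        # One impression for the direct listener plus one per joined user.
        return 1 + self.inferred_listeners


stream = SharedStream(owner_id="first-user")
stream.join("user-b")
stream.join("user-c")
stream.join("user-b")  # duplicate notification, ignored
```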

It is appreciated that the operations of FIG. 5B could be done by different parties than those illustrated, as would be understood by one of skill in the art. For example, rather than directly notifying 535 the content server system 100 of the additional indirect listener, the applications 111B could notify the client application 111A that the users have joined, and the application 111A could update its user interface to reflect the additional users and could notify 535 the content server system 100.

FIG. 5C illustrates interactions involved when estimating the number of inferred listeners based on analysis of user reactions, according to one embodiment. A first application 111A of a first client device 110A (e.g., a smartphone) makes a request for the content (e.g., an audio stream), e.g., at the request of a first user of the first client device, and the server system begins 412 sending the content to the application 111A. Additionally, additional users have additional client devices 110B (e.g., smartphones) with corresponding additional applications 111B, although they are not currently receiving the content provided to the first client device 110A.

The first and additional users hear the content (e.g., audio stream of music) and react to the content in related ways, such as moving, consciously or subconsciously, in time with the music. Sensors of the client devices 110A, 110B (e.g., accelerometers detecting the movement) measure 505, 506 the reaction, thereby producing reaction data.

In one embodiment, at some point the application 111A detects 510 the presence of the client devices 110B, e.g., by explicitly sending broadcast requests via short-range wireless communication for other instances of the application 111, or by receiving broadcasts from the other applications 111B that note the presence of the client devices 110B. The application 111A sends its own reaction data 515 to the content server system 100, and also requests 520 the applications 111B to send the reaction data of step 506 to the content server system. The applications 111B accordingly send 525 their reaction data to the content server system. The content server system then analyzes 530 the reaction data, determining a degree of similarity between them. For example, the degree of similarity of movement could be determined by comparing the magnitude of the movements and the times at which the movements take place. In one embodiment, if the content server system 100 determines that there is sufficient similarity between the reaction data from the client device 110A and the client devices 110B, then it determines that the users of the various client devices are listening to the same content and therefore that there are inferred listeners (namely, the users of the client devices 110B) for that content.
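One way the similarity determination of the analysis 530 might be sketched is shown below. The sketch assumes that each device reports accelerometer magnitudes sampled at the same instants, so that comparing the series position by position compares both magnitude and timing. The use of the Pearson correlation coefficient and the threshold of 0.8 are illustrative assumptions, as are the function names; no particular similarity measure is specified herein.

```python
import math
from typing import Mapping, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, time-aligned series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a motionless device carries no evidence of similarity
    return cov / math.sqrt(var_x * var_y)


def count_similar_reactions(
    reference: Sequence[float],
    others: Mapping[str, Sequence[float]],
    threshold: float = 0.8,
) -> int:
    """Count devices whose reaction data correlates with the reference."""
    return sum(
        1 for series in others.values() if pearson(reference, series) >= threshold
    )


# Movement magnitudes of the streaming device, peaking on every other sample.
first_device = [0, 1, 0, 1, 0, 1, 0, 1]
other_devices = {
    "in_time": [0.1, 0.9, 0.2, 1.1, 0.0, 1.0, 0.1, 0.8],  # moves with the music
    "still": [0.5] * 8,                                    # no movement
    "off_beat": [1, 0, 1, 0, 1, 0, 1, 0],                  # opposite phase
}
inferred = count_similar_reactions(first_device, other_devices)
```

Here only the device whose movement tracks that of the first device would be treated as belonging to an inferred listener.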

In a further embodiment, the content server system 100 additionally correlates the reactions described by the reaction data to known information about the provided content. In this embodiment, for example, in order for the second user to be considered an inferred listener, in addition to the reaction data of the second client device 110B having at least a threshold degree of similarity to that of the first client device 110A, the reactions must have some correlation with significant moments in the content, such as movement reaction data occurring within a threshold time of significant changes of volume or pitch in audio stream content (indicating users moving in time to music within the audio stream).

Advertisements are then provided 425 and impression counts logged 430 to reflect the inferred listeners. In another embodiment, the analysis is performed by the client devices 110, with the application 111A (or 111B) obtaining the reaction data for all of the client devices 110 and performing the analysis as described above at step 530, and notifying the content server system 100 of a number of inferred users (if any) detected based on the analysis.

FIG. 5D illustrates interactions involved when estimating the number of inferred listeners based on explicit inducement of a user reaction, even in the absence of more than one client device 110, according to one embodiment. A first application 111A of a first client device 110A (e.g., a smartphone) makes a request for the content (e.g., an audio stream), e.g., at the request of a first user of the first client device, and the server system begins 412 sending the stream to the application 111A.

The content server system 100 sends 505 a stimulant message to the application 111A, the stimulant message intended to elicit a measurable reaction from the first user and from any other nearby listeners. The stimulant message could be an independent message, or part of an advertisement. For example, in one embodiment in which the provided content is an audio stream, the stimulant message is audio inserted into the audio stream that induces the audience members to make an audible vocal response, such as “Cheer if you like X,” where X is some product, concept, or the like.

The application 111A accordingly outputs 510 the stimulant message (e.g., playing the audio “Cheer if you like X”). The users in the audience then may react 515, such as by cheering. The application 111A measures 520 the reaction using the sensors of the client device 110A, such as its microphone input.

In one embodiment, the application 111A sends 525 the measured reaction data to the content server system 100 for analysis, and the content server system analyzes 530 the reaction data to estimate a number of distinct people represented by the reaction data. For example, in one embodiment the content server system analyzes human voice audio input, such as that resulting from user reactions in the “Cheer if you like X” example, to identify a number of distinct voices, e.g., using auditory scene analysis algorithms such as prediction-driven approaches. The content server system 100 then updates the inferred listeners count associated with the content (e.g., audio stream) to N, assuming that (N+1) distinct people were estimated to be in the audience, one of whom is presumably the direct listener. In a different embodiment, the analysis 530 is performed on the client device 110A by the application 111A, and the application sends to the content server system 100 an updated count of inferred listeners associated with the content based on its analysis 530.
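The final counting step of the analysis 530 may be sketched as follows. The sketch assumes that some upstream voice-separation component (not shown, and not specified herein) has already labeled each detected voice segment with a speaker label; the function name and the label format are hypothetical.

```python
from typing import Iterable


def inferred_listeners_from_voices(speaker_labels: Iterable[str]) -> int:
    """Map per-segment speaker labels to an inferred-listener count N.

    If (N + 1) distinct people are detected, one of them is presumed to be
    the direct listener, leaving N inferred listeners. The result is clamped
    at 0 for the case in which no voices are detected at all.
    """
    distinct_people = len(set(speaker_labels))
    return max(0, distinct_people - 1)


# Four voice segments attributed to three distinct speakers.
n = inferred_listeners_from_voices(["s1", "s2", "s1", "s3"])
```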

In one embodiment, steps 515 and 520 may take place without prior sending 505 and outputting 510 of a stimulant message. For example, the application 111A may also perform identification of distinct voices in conversations that occur naturally and spontaneously, such as those involved in a conversation between friends.

FIG. 6 is a high-level block diagram illustrating physical components of a computer 600 used as part or all of the content server system 100 or client device 110 from FIG. 1, according to one embodiment. Illustrated are at least one processor 602 coupled to a chipset 604. Also coupled to the chipset 604 are a memory 606, a storage device 608, a keyboard 610, a graphics adapter 612, a pointing device 614, and a network adapter 616. A display 618 is coupled to the graphics adapter 612. In one embodiment, the functionality of the chipset 604 is provided by a memory controller hub 620 and an I/O controller hub 622. In another embodiment, the memory 606 is coupled directly to the processor 602 instead of the chipset 604.

The storage device 608 is any non-transitory computer-readable storage medium, such as a hard drive, compact disk read-only memory (CD-ROM), DVD, or a solid-state memory device. The memory 606 holds instructions and data used by the processor 602. The pointing device 614 may be a mouse, track ball, or other type of pointing device, and is used in combination with the keyboard 610 to input data into the computer 600. The graphics adapter 612 displays images and other information on the display 618. The network adapter 616 couples the computer 600 to a local or wide area network.

As is known in the art, a computer 600 can have different and/or other components than those shown in FIG. 6. In addition, the computer 600 can lack certain illustrated components. In one embodiment, a computer 600 acting as a server may lack a keyboard 610, pointing device 614, graphics adapter 612, and/or display 618; similarly, a computer 600 acting as a smartphone may lack a keyboard 610 or external pointing device 614, for example. Moreover, the storage device 608 can be local and/or remote from the computer 600 (such as embodied within a storage area network (SAN)).

As is known in the art, the computer 600 is adapted to execute computer program modules for providing functionality described herein. As used herein, the term “module” refers to computer program logic utilized to provide the specified functionality. Thus, a module can be implemented in hardware, firmware, and/or software. In one embodiment, program modules are stored on the storage device 608, loaded into the memory 606, and executed by the processor 602.

Other Considerations

The present invention has been described in particular detail with respect to one possible embodiment. Those of skill in the art will appreciate that the invention may be practiced in other embodiments. First, the particular naming of the components and variables, capitalization of terms, the attributes, data structures, or any other programming or structural aspect is not mandatory or significant, and the mechanisms that implement the invention or its features may have different names, formats, or protocols. Also, the particular division of functionality between the various system components described herein is merely for purposes of example, and is not mandatory; functions performed by a single system component may instead be performed by multiple components, and functions performed by multiple components may instead be performed by a single component.

Some portions of the above description present the features of the present invention in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. These operations, while described functionally or logically, are understood to be implemented by computer programs. Furthermore, it has also proven convenient at times to refer to these arrangements of operations as modules or by functional names, without loss of generality.

Unless specifically stated otherwise as apparent from the above discussion, it is appreciated that throughout the description, discussions utilizing terms such as “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system memories or registers or other such information storage, transmission or display devices.

Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by real time network operating systems.

The present invention also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored on a computer readable medium that can be accessed by the computer. Such a computer program may be stored in a non-transitory computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of computer-readable storage medium suitable for storing electronic instructions, and each coupled to a computer system bus. Furthermore, the computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.

The algorithms and operations presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will be apparent to those of skill in the art, along with equivalent variations. In addition, the present invention is not described with reference to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any references to specific languages are provided for disclosure of enablement and best mode of the present invention.

The present invention is well suited to a wide variety of computer network systems over numerous topologies. Within this field, the configuration and management of large networks comprise storage devices and computers that are communicatively coupled to dissimilar computers and storage devices over a network, such as the Internet.

Finally, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.

What is claimed is:
 1. A computer-implemented method, comprising: identifying a streaming client device receiving streamed audio; determining a number of inferred listeners that are listening to the streamed audio when output by the streaming client device, the inferred listeners being in addition to a user using the streaming client device; including an advertisement in the streamed audio provided to the streaming client device; and logging an impression count responsive to the inclusion of the advertisement, the impression count reflecting the determined number of inferred listeners.
 2. The computer-implemented method of claim 1, wherein determining the number of inferred listeners comprises: receiving first location-related data associated with the streaming client device; receiving additional location-related data associated with additional client devices other than the streaming client device; determining, based on the first location-related data and the additional location-related data, that a user of the streaming client device and users of the additional client devices can all hear audio output of the streaming client device; and responsive to determining that the user of the streaming client device and users of the additional client devices can all hear audio output of the streaming client device, determining the number of inferred listeners based on a number of the additional client devices.
 3. The computer-implemented method of claim 1, wherein determining the number of inferred listeners comprises: receiving first data derived from sensors of the streaming client device; receiving additional data derived from sensors of additional client devices other than the streaming client device; determining, based on the first data of the streaming client device and of the additional client devices, that a user of the streaming client device and users of the additional client devices are behaving in a similar manner; and responsive to determining that the user of the streaming client device and the users of the additional client devices are behaving in a similar manner, determining the number of inferred listeners based on a number of the additional client devices.
 4. The computer-implemented method of claim 1, wherein determining the number of inferred listeners comprises: providing a stimulant message to the streaming client device, the stimulant message prompting users to react in a specified manner; receiving, from the streaming client device, reaction data measuring user reaction to the stimulant message; estimating a number of distinct people represented by the reaction data; and responsive to the estimation, determining the number of inferred listeners based on the estimated number of distinct people.
 5. The computer-implemented method of claim 1, wherein determining the number of inferred listeners comprises identifying a number of distinct human voices in audio input data of the streaming client device.
 6. The computer-implemented method of claim 1, further comprising: selecting the advertisement that is included in the streamed audio based at least in part on the inferred listeners.
 7. The computer-implemented method of claim 6, further comprising: determining that there is a plurality of inferred users; identifying a plurality of languages spoken among the inferred users and the user using the streaming client device; identifying a primary language of a country in which the streaming client device is located; and responsive to identifying the plurality of languages, selecting, as the advertisement that is included in the streamed audio, an advertisement in the primary language.
 8. The computer-implemented method of claim 1, wherein determining the number of inferred listeners comprises: determining whether audio output of the streaming client device is being provided via headphones; and responsive to determining that the audio output of the streaming client device is being provided via headphones, determining that there are no inferred listeners.
 9. A computer-implemented method performed by a client device, comprising: receiving streamed audio from a content server system; outputting the streamed audio; determining a number of inferred listeners that are listening to the streamed audio output by the client device, the inferred listeners being in addition to a user using the client device; and sending, to the content server system, an audience count that is based on the inferred number of listeners.
 10. The computer-implemented method of claim 9, comprising: receiving an advertisement from the content server system that is selected based at least in part on the inferred listeners.
 11. The computer-implemented method of claim 9, comprising: receiving location-related data associated with additional client devices; determining, based on location-related data of the client device, and on the location-related data associated with the additional client devices, that a user of the client device and users of the additional client devices can all hear audio output of the client device; and responsive to determining that the user of the client device and users of the additional client devices can all hear audio output of the client device, notifying the content server system of a presence of inferred listeners.
 12. The computer-implemented method of claim 9, comprising: receiving a request from a user of the client device to share the streamed audio with nearby users; and responsive to receiving a notification that a nearby user has joined the audio stream that was shared, notifying the content server system of a presence of an additional inferred listener.
 13. The computer-implemented method of claim 9, comprising: receiving, from additional client devices via short-range wireless communications, data derived from sensors of the additional client devices; determining, based on data derived from sensors of the client device and on the data derived from sensors of the additional client devices, that a user of the client device and users of the additional client devices are behaving in a similar manner; and responsive to the determination, notifying the content server system of a presence of inferred listeners.
 14. The computer-implemented method of claim 9, comprising: receiving, within the streamed audio, a stimulant message prompting users to react in a specified manner; obtaining reaction data by measuring user reaction to the stimulant message on the client device; estimating a number of distinct people represented by the reaction data; and responsive to the number being greater than 1, notifying the content server system of a presence of inferred listeners.
 15. A computer-readable storage medium comprising computer program instructions executable by a processor, the instructions comprising: instructions for identifying a streaming client device receiving streamed audio; instructions for determining a number of inferred listeners that are listening to the streamed audio when output by the streaming client device, the inferred listeners being in addition to a user using the streaming client device; instructions for including an advertisement in the streamed audio provided to the streaming client device; and instructions for logging an impression count responsive to the inclusion of the advertisement, the impression count reflecting the determined number of inferred listeners.
 16. The computer-readable storage medium of claim 15, wherein determining the number of inferred listeners comprises: receiving first location-related data associated with the streaming client device; receiving additional location-related data associated with additional client devices other than the streaming client device; determining, based on the first location-related data and the additional location-related data, that a user of the streaming client device and users of the additional client devices can all hear audio output of the streaming client device; and responsive to determining that the user of the streaming client device and users of the additional client devices can all hear audio output of the streaming client device, determining the number of inferred listeners based on a number of the additional client devices.
 17. The computer-readable storage medium of claim 15, wherein determining the number of inferred listeners comprises: receiving first data derived from sensors of the streaming client device; receiving additional data derived from sensors of additional client devices other than the streaming client device; determining, based on the first data of the streaming client device and of the additional client devices, that a user of the streaming client device and users of the additional client devices are behaving in a similar manner; and responsive to determining that the user of the streaming client device and the users of the additional client devices are behaving in a similar manner, determining the number of inferred listeners based on a number of the additional client devices.
 18. The computer-readable storage medium of claim 15, wherein determining the number of inferred listeners comprises: providing a stimulant message to the streaming client device, the stimulant message prompting users to react in a specified manner; receiving, from the streaming client device, reaction data measuring user reaction to the stimulant message; estimating a number of distinct people represented by the reaction data; and responsive to the estimation, determining the number of inferred listeners based on the estimated number of distinct people.
 19. The computer-readable storage medium of claim 15, further comprising: selecting the advertisement that is included in the streamed audio based at least in part on the inferred listeners.
 20. The computer-readable storage medium of claim 15, wherein determining the number of inferred listeners comprises: determining whether audio output of the streaming client device is being provided via headphones; and responsive to determining that the audio output of the streaming client device is being provided via headphones, determining that there are no inferred listeners.